CHAPTER I


INTRODUCTION 


Problem 


The  purpose  of  this  study  was  to  determine  whether  or  not  the 
experimentally  established  attitudes  of  the  examiner  during  the  admin- 
istration of  an  intelligence  test  could  influence  the  subject's  test 
performance.  More  specifically,  the  experimenter  wished  to  evaluate 
the  effects  of  a  Positive  Administration  (i.e.,  an  administration  char- 
acterized by  a  warm  and  highly  approving,  interested  manner  of  test 
administration)  and  of  a  Negative  Administration  (i.e.,  an  administration 
characterized  by  a  persistently  rejecting  and  disinterested  manner  of 
test administration).

The  study  was  concerned  mainly  with  the  examiner  variable  in 
the psychologist-subject relationship in intelligence testing. The
experimenter  hoped  to  amplify  the  scant  amount  of  research  in  this  area 
by  demonstrating  a  significant  difference  in  the  subject's  intelligence 
test  performance  as  a  result  of  the  administration  experience  of  two 
artificial,  but  possible,  conditions  of  the  examiner  attitude. 

Moreover,  in  order  to  understand  more  fully  the  interaction 
between  the  examiner  and  the  subject,  the  experimenter  was  interested 
in identifying, through the use of a rating scale, aspects of examiner
performance  which  are  present  in  one  form  of  administration  and  not 
present  in  the  other. 


Development  of  the  Problem 

Consideration of interpersonal and situational variables in
psychological testing has been a matter of major concern for about a
dozen years. Joel (1949) cogently states:

It  has  been  argued  that  the  effect  of  the  examiner  should  be 
reduced to a minimum by assuming a constant warm attitude which
does not change under the impact of the subject's personality.
There  can  be  no  argument  about  the  desirability  of  a  friendly 
attitude  on  the  part  of  the  examiner  or  about  its  beneficial  ef- 
fect on  rapport.  But  we  make  a  fundamental  mistake  if  we  believe 
the  assumed  attitude  of  the  examiner  can  really  nullify  the  dynam- 
ics of  the  testing  situation.  Clinical  psychologists  are  human, 
and  the  assumed  attitude  of  warmth  notwithstanding,  they  react  to 
different  subjects  in  different  ways,  partly  because  of  irrational 
attitudes.  So  we  must  reckon  with  the  effect  of  the  examiner's 
actual  continuously  changing  feelings  underneath  the  assumed 
attitudes. 

More recently, this matter is again touched upon by Cronbach
(1960, p. 60), who writes:

The  tester  has  been  accustomed  to  think  of  himself  as  an  unemotional, 
impartial  task-setter.  His  traditions  encourage  the  idea  that  he, 
like the physical scientist or engineer, is "measuring an object"
with  a  technical  tool.  But  the  "object"  before  him  is  a  person, 
and  testing  involves  a  complex  psychological  relationship.  The 
traditional concern with motivation and rapport recognizes this
fact  but  .  .  .  leads  to  little  more  than  a  recommendation  that  the 
tester  be  pleasant  and  encouraging,  and  help  the  subject  under- 
stand the  value  of  the  test.  This,  we  are  beginning  to  suspect, 
barely  touches  the  real  social-psychological  complexities  of 
testing. 

A recent review of the literature dealing with the influence of
situational and interpersonal variables in projective testing (Masling,
1960) cites over seventy references dealing with the method of
administration, the testing situation, examiner influence, and subject
influence. Generally speaking, the crucial element in modifying the
subject's projective response was the extent to which the subject's
attitude toward the total testing situation was influenced by the
experimental conditions. When the experimental variable was peripheral
to the examination (i.e., when it occurred before the actual test
administration), no appreciable effect was introduced in the protocol.

It is necessary, before reviewing recent pertinent literature
in this area, to discuss the nature of the task as a significant
variable in demonstrating the effect of the testing conditions on the
subject's performance. Projective tests are commonly regarded as
unstructured and ambiguous stimuli. Those tests which are characterized
by specific instructions, clearly defined stimuli, and right and wrong
answers are regarded as structured or non-projective tests. The
ambiguity of the projective testing situation enhances the probability that
the examiner and the subject will be influenced by each other in
completing the testing task.

Masling (1960) summarizes the idea that the subject or the
examiner, in an unstructured situation, will utilize all the possible
cues by stating:

The  S  in  the  projective  test  setting  will  not  only  use  those 
cues  furnished  by  the  ink  blot  or  picture,  but  also  those  supplied 
by  his  feelings  about  the  examiner,  those  furnished  by  his  needs, 
attitudes  and  fears,  those  implied  in  the  instructions,  the  room, 
and  previous  knowledge  of  the  test,  and  those  cues  supplied  con- 
sciously or  unconsciously  by  E.  When  E  faces  the  ambiguous  sit- 
uation of  supplying  meaning  to  a  series  of  isolated,  discrete 
responses,  he  will  not  only  rely  on  S's  responses,  but  also  on 
those  cues  furnished  by  his  training  and  theoretical  orientation, 
his  own  needs  and  expectations,  his  feelings  about  S  and  the  con- 
structions he  places  on  S's  test  behavior  and  attitudes. 

The implication here is that, with well structured so-called
non-projective tests, the probability that situational and interpersonal
variables will influence test performance and results is greatly
diminished. In other words, Masling implies that because the testing
situation is well structured, the ability of the examiner to be
objective and not involved personally with the subject is greatly enhanced.
The following review of the literature suggests that this implication
is questionable.

The amount of research on the examiner-subject relationship in
intelligence testing is small. The few articles reviewed suggest,
however, that this relationship is important to the subject's score and
the evaluation of results. The research also suggests that the effect
of this relationship varies in intensity with different groups.
Poorly-adjusted people are more affected than well-adjusted people. Further,
there is the indication that the socio-economic level of the subject,
his cultural background, and his present life situation are all variables
to be considered in evaluating the results of intelligence tests.

In  view  of  the  above  development,  the  author  felt  that  this 
experiment  would  be  a  significant  contribution  to  the  study  of  the  in- 
fluence of  situational  and  interpersonal  variables  in  intelligence 
testing. 

Research  with  Projective  Tests 

The  following  review  of  representative  studies  of  the  influence 
of  psychologist-subject  relationship  in  projective  testing  will  demon- 
strate that  this  relationship  can  modify  test  results. 

Lord (1950) used three different atmospheres of Rorschach
administration: (1) a usual standardized administration, (2) a
standardized administration with negative emotional loading in which the
examiner assumed the role of a cold and harsh figure demonstrating no
concern for the subject, and (3) a standardized administration with
positive emotional loading in which the examiner was warm and charming.
Lord's method was to employ three female examiners and have each subject
take the Rorschach three times (once with each examiner under a different
atmosphere). She not only found that the different methods of
interaction varied the protocols elicited from the same subjects, but also
that a greater number of differences occurred when the examiners gave
the test in their normal manner than when they gave the test under
conditions of assumed rapport. Lord concluded that the underlying
personality of the examiner influences the subject's Rorschach test
performance to a greater extent than any assumed rapport.

Working  with  a  similar  examiner  variable  of  negative  and  positive 
rapport  and  with  the  Rorschach  test,  Luft  (1953)  found  that  the  subjects 
treated  in  a  warm  fashion  indicated  that  they  liked  a  significantly 
greater  number  of  inkblots  than  those  subjects  treated  in  a  cold  manner. 

Sanders and Cleveland (1953) approached the problem of examiner
influence on Rorschach scores not by varying the mode of administration,
as the previous two studies did, but by training nine male graduate
students, unsophisticated as to projective techniques, to administer
the Rorschach test after the experimenters obtained a personal Rorschach
from each. The examiners then administered twenty Rorschachs each to
undergraduate subjects who, in turn, rated the examiners on measures of
overt anxiety and hostility; the examiners' covert anxiety and hostility
were obtained from their Rorschach protocols. On the basis of these
measures, Sanders and Cleveland found that the examiners rated in a
particular way (with regard to hostility or anxiety) tended to elicit
Rorschachs which differed significantly from those Rorschachs elicited
by examiners with different ratings. In short, different examiners
elicited different Rorschach scores from their subjects.

Exploring  the  possibility  of  finding  significant  over-all  differ- 
ences in  Rorschach  protocols  obtained  by  various  examiners  with  similar 
backgrounds  from  a  homogeneous  group  of  patients  (white,  male  veterans, 
25-32 years old with functional rather than organic ailments), Gibbey,
Miller  and  Walker  (1953)  analyzed  the  obtained  Rorschach  protocols  for 
examiner  influence  and  found  that  the  examiners  differed  from  each  other 
significantly  in  the  determinants  they  elicited  from  comparable  subjects. 

That  differential  instructions  can  influence  Rorschach  scores  is 
clearly  demonstrated  by  Henry  and  Rotter  (1956),  who  gave  the  control 
group  a  standard  administration,  but  informed  the  experimental  group 
that  the  test  is  used  to  discover  serious  emotional  disturbances.  The 
experimental  group  gave  significantly  more  cautious  and  conforming 
responses  than  the  control  group. 

Whether  or  not  the  examiner  is  present  during  the  testing  ap- 
pears to  be  a  significant  variable  in  influencing  test  performance. 
Bernstein  (1956),  using  the  TAT  under  conditions  of  examiner  present 
and  then  absent  for  both  written  and  oral  TAT  productions,  found  that 
the  only  significant  difference  in  the  stories  was  a  function  of  the 
examiner  being  present  or  absent.  The  results  indicated  that  the  ex- 
aminer's presence  acts  as  an  inhibiting  factor  for  strongly  emotional 
material.

Not only does the examiner appear to have a significant influence
on projective test performance, but also the physical surroundings seem
to have an influence. Rabin, Nelson, and Clark (1954), in order to
study the effects of the physical surroundings on Rorschach performance,
used two experimental groups of males in differently decorated waiting
rooms and a control group in an undecorated waiting room. One
experimental group waited in a room decorated with anatomical charts and
surgical pictures; the other group waited in a room decorated with
photographs of nude and seminude females. No significant difference
between groups in the number of anatomical responses was found, but
there was a significant difference in the number of sexual responses.
A further interesting finding of this study was that those subjects who
waited in the room decorated with pictures of nude women gave
significantly more sexual responses to the male examiner than to the female
examiner.

Two studies using operant conditioning of the subject's verbal
behavior further demonstrate that cues given by the examiner can affect
test results. Wickes (1956) used a control group and two experimental
groups for a series of homemade inkblots similar to the Rorschach test.
Of the two experimental groups, one received verbal reinforcement (e.g.,
"fine" and "good") to movement responses; the other received postural
reinforcement (e.g., nodding and smiling) to movement responses. By
introducing reinforcement during the second half of the inkblot series,
Wickes found a significant increase in movement responses for both
reinforcement groups, but found no such increase in the control group.
Gross (1959) in a similar study reinforced human content responses on
the Rorschach, verbally to one experimental group, and posturally or
non-verbally to another. It was found that both the verbally reinforced
and non-verbally reinforced groups gave significantly more human
responses than the control group and that there was no significant
difference between the two types of reinforcement.

The  last  study  to  be  reviewed  in  the  area  of  the  influence  of 
interpersonal  and  situational  variables  on  projective  test  performance 
deals  with  the  influence  of  the  subject  upon  the  examiner.  Masling 
(1957) used two female confederates who posed as
subjects and acted warm or cold toward a group of naive examiners who had
the  task  of  interpreting  sentence  completion  protocols  yielded  by  the 
subjects.  Masling  found  that  when  the  subject  acted  warm  to  the  ex- 
aminer her  protocol  was  interpreted  more  favorably  (i.e.,  she  appeared 
in  better  mental  health)  than  when  she  acted  coldly. 

Research with Non-Projective Tests

All the pertinent research in the area of measuring the examiner
influence on "structured non-projective" testing has been done with
individually administered intelligence tests. The amount of research
in this area is scant when compared with similar investigations in
projective testing.

Intelligence  tests  are  regarded  as  objective  and  impersonal 
instruments for the study of behavior. The stimuli are clearly defined;
the answers are right or wrong; the instructions are specific, and the
examiner reads the questions as they appear in a manual; the answers are
scored  according  to  the  manual.  The  following  review  will,  however, 
demonstrate  that  there  is  reason  to  suspect  the  objectivity  of  intel- 
ligence tests. 


Lantz (1945) studied the effects of situational variables (i.e.,
success or failure) upon intelligence test performance. Using nine
selected tasks from Form L of the Revised Stanford-Binet and nine
comparable tasks from Form M as measures of intelligence, Lantz found that
the "failure group" of subjects (who experienced failure in a game in
which the task was to secure, in three different ways, a ball from a box)
showed a lack of the expected average test-retest increase in score
and a decrease in the number of correct responses on those test items
which involved the use of thought processes. The "failure group" also
demonstrated a decrease on the ratings of willingness, self-confidence,
and attention. The "success group," however, demonstrated an increase
of average scores on the mental tasks with a decrease in score
variability.

Getting closer to the examiner-subject relationship, Staudt
(1948) gave three groups of college students tests of verbal analogies,
arithmetic, and cancellation. The first group was tested under normal
conditions. The second group was instructed to work accurately. The
third group was instructed to work very rapidly and was subjected to
tension-evoking conditions (e.g., a buzzer sounded every thirty seconds,
at which times the examiner stated how many items should have been
completed). These conditions produced feelings of inferiority in the
subjects, because the level of attainment was set beyond their capacities.
Staudt's results demonstrated significantly more errors for the tension
group and no difference between the other two groups.

Hutt (1947) compared IQ ratings obtained by "consecutive"
testing (i.e., normal procedure) and "adaptive" testing (i.e., procedure of
following every failure by an easier item) on Form L of the Revised
Stanford-Binet. The results showed that well-adjusted children make
similar scores with the two methods of administration, whereas
poorly-adjusted children make higher IQ ratings with the "adaptive" testing.
The suggestion here is that, because some children experience an
increasing sense of failure with the "consecutive" testing, they may therefore
be unable to succeed on later test items. Hutt felt that the "adaptive"
testing yielded more valid IQs with the poorly-adjusted children.

Sacks (1952) directly manipulated the nature of the social relationship
between the examiner and the subject and determined its effect
upon intelligence test performance by establishing (1) a good
relationship, (2) a poor relationship, and (3) a control condition (i.e., no
relationship), respectively, with three groups of nursery school children.
Sacks found in all three groups a mean increase in IQ from Form L
of the Revised Stanford-Binet to Form M. The children in the
"relationship" groups demonstrated a significantly greater change than did those
in the control group.

Masling (1959), investigating the effects of the subject upon
the examiner's administration and scoring of selected subtests from the
Wechsler-Bellevue II, had two subjects act either warm or cold to naive
examiners who were sophisticated in intelligence testing. The subjects
also gave memorized responses, most of which were devised to be
difficult to score. The study thus compared the examiners' treatment of
warm and cold subjects, and the results showed conclusively that the
examiners were more lenient toward the warm subjects, used more reinforcing
statements with them, and gave them more opportunity to clarify or correct
responses.

Hypotheses

In  light  of  the  above  research,  the  specific  hypotheses  which 
were  tested  were  as  follows: 

I.  There  will  be  a  consistent  shift  in  intelligence  test 
performance  as  a  result  of  the  different  experimentally  established 
negative and positive interactions between the examiner and the subject.

II.  This  shift  will  most  likely  reflect  a  decrement  in  score 
for  the  majority  of  the  subjects  under  the  negative  treatment  condition. 

III. The subjects' ratings of the examiners will be clearly
different as a result of the different treatment conditions.

IV.  The  subjects  will  perceive  the  negative  treatment  condi- 
tion examiners  in  an  unfavorable  light,  whereas  they  will  perceive  the 
positive  treatment  condition  examiners  in  a  favorable  light. 


CHAPTER II


EXPERIMENTAL PROCEDURE


Subjects 

The subjects used in this experiment were 48 male undergraduate
students at the University of Florida. They acted as their own controls
and were tested in both treatments, yielding a total of 96 scores for
comparison. Most of them were sophomores and juniors and were selected
from a course in general psychology.

For purposes of sampling a wide range of intellectual ability,
the A.C.E. was used as a means of selecting subjects. The wide range
was developed by selecting the subjects on the basis of divisions of
ranges of percentile ranks: one-sixth of the subject population was
drawn from range 24 and below; one-sixth from 25 to 39; one-sixth from
40 to 54; one-sixth from 55 to 69; one-sixth from 70 to 84; one-sixth
from 85 and above. This division meant that there should have been
8 subjects within each range.

Because 5 subjects did not appear and because the source of
alternate subjects was limited, the actual number of subjects per
division worked out as follows: 7 subjects from range 24 and below;
6 from 25 to 39; 7 from 40 to 54; 10 from 55 to 69; 9 from 70 to 84;
and 9 from 85 and above.
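
The selection scheme above can be sketched in a few lines of Python. This is only an illustration: the function and variable names are hypothetical, the fourth range is taken as 55 to 69 (as defined in the selection scheme), and the obtained counts are the ones reported in the text.

```python
from bisect import bisect_right

# Lower bounds of the second through sixth percentile-rank ranges in the text.
STRATUM_BOUNDS = [25, 40, 55, 70, 85]
STRATUM_LABELS = ["24 and below", "25-39", "40-54", "55-69", "70-84", "85 and above"]


def stratum_for(percentile_rank: int) -> str:
    """Return the label of the range that contains a percentile rank."""
    return STRATUM_LABELS[bisect_right(STRATUM_BOUNDS, percentile_rank)]


# Planned quota: 48 subjects spread evenly over six ranges.
PLANNED_PER_STRATUM = 48 // len(STRATUM_LABELS)

# Obtained counts as reported in the text, in the same order as the labels.
OBTAINED = dict(zip(STRATUM_LABELS, [7, 6, 7, 10, 9, 9]))


def deviations_from_plan(obtained: dict) -> dict:
    """Obtained count minus planned quota for each range."""
    return {label: n - PLANNED_PER_STRATUM for label, n in obtained.items()}


if __name__ == "__main__":
    for label, diff in deviations_from_plan(OBTAINED).items():
        print(f"{label:>13}: {OBTAINED[label]:2d} obtained ({diff:+d} vs. plan)")
```

The obtained counts still total 48, so the departures from the plan redistribute subjects across ranges rather than reduce the sample.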

As will be seen in the Procedure section of this chapter, each
subject followed a unique pattern of testing. It was essential that
the subject follow his pattern to completion for meaningful data to
occur. All of the subjects cooperated except one who dropped out
before he completed his pattern. Minor modifications in the statistical
treatment of the data were made and will be discussed as they are
encountered.

Examiners 

Hammond  (1954)  criticizes  studies  on  the  examiner  effect  in 
psychological  testing  on  the  basis  that  they  fail  to  take  an  adequate 
sample  of  the  examiner  population  and  thus  limit  the  degree  to  which 
results  can  be  generalized  to  larger  groups  of  examiners  and  subjects. 

The present experiment attempted to overcome this weakness in
the independent variable by using eight male graduate students in
psychology as examiners. Each examiner had completed a course in
individual intelligence test administration and scoring, in addition to having
had experience along these lines in a practicum setting.

Materials 
Intelligence  Test: 

Two series of five subtests (Pentads) each from the Wechsler
Adult Intelligence Scale were used as measures of intelligence test
performance. Pentad I consisted of subtests: Information, Similarities,
Vocabulary, Picture Arrangement, and Object Assembly. Pentad II
consisted of subtests: Comprehension, Arithmetic, Digit Symbol, Picture
Completion, and Block Design.

According to Maxwell (1957), who followed up the work of
McNemar (1960) and Doppelt (1956) on estimating full scale IQ from
short forms of the WAIS, Pentad I correlates .972 with full scale score
and Pentad II correlates .966 with full scale score. These correlations
receive indirect support from the research of Howard (1959) and Clayton
and Payne (1959), who tested the validity of similar short forms.

Thus  these  two  Pentads  estimate  with  little  error  variance  the 
measurement  of  intelligence  as  shown  by  the  full  scale  WAIS.  Because 
each  fails  to  account  for  approximately  .06  of  the  total  variance,  these 
two  Pentads  could,  at  a  very  minimum,  correlate  .94  with  each  other. 
On  this  basis  it  was  felt  that  the  two  Pentads  could  be  regarded  as 
equivalent  enough  to  compare  the  effects  of  the  treatment  conditions. 
Standard  WAIS  record  forms  were  used  to  record  the  data. 
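
The arithmetic behind these figures can be checked with a short sketch. The two correlations come from the text; the function names are hypothetical. The .94 figure agrees with the product of the two correlations, which is the value expected under the added assumption (not stated in the source) that the parts of each Pentad unrelated to the full scale are uncorrelated with each other. The second function gives the range that is mathematically possible for any two measures with these correlations to a third, and its lower limit is somewhat below .94.

```python
import math
from typing import Tuple

R_PENTAD_I = 0.972   # correlation of Pentad I with full scale score (from the text)
R_PENTAD_II = 0.966  # correlation of Pentad II with full scale score (from the text)


def unexplained_variance(r: float) -> float:
    """Proportion of full scale variance not accounted for by a Pentad."""
    return 1.0 - r * r


def correlation_bounds(r1: float, r2: float) -> Tuple[float, float]:
    """Possible range of the correlation between two measures, given the
    correlation of each with a common third measure."""
    spread = math.sqrt((1.0 - r1 * r1) * (1.0 - r2 * r2))
    return r1 * r2 - spread, min(1.0, r1 * r2 + spread)


if __name__ == "__main__":
    print(round(unexplained_variance(R_PENTAD_I), 3))    # about .055
    print(round(unexplained_variance(R_PENTAD_II), 3))   # about .067
    print(round(R_PENTAD_I * R_PENTAD_II, 2))            # about .94
    low, high = correlation_bounds(R_PENTAD_I, R_PENTAD_II)
    print(round(low, 2), round(high, 2))
```

Under either reading the two Pentads share most of their variance, which is the point the equivalence argument rests on.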

Rating Scale:

The rating scale consisted of twenty-seven personal adjectives
representative of four major areas of personality (Dominance, Activity,
Social Sensitivity, and Mood) which, it was felt, are pertinent to
psychological testing. The rating was along a continuum of Strongly Agree
to Strongly Disagree.

The  rating  scale  was  derived  from  the  SAQS  Chicago  Q  Sort 
(Corsini,  1956),  which  seemed  to  provide  a  valuable  research  tool  for 
this experiment, because it contains a number of previously selected
personal  adjectives  describing  people.  These  adjectives  are  applicable 
for  the  study  of  the  interaction  between  the  examiner  and  the  subject. 
Table  1  presents  the  actual  rating  scale  and  Table  5,  the  adjectives 
grouped  according  to  major  personality  areas  and  a  comparison  of  the 
subject's  ratings  of  the  examiners  under  the  two  different  treatment 
conditions. 
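
One way to make ratings from the two treatment conditions comparable is to code the five response categories numerically. The sketch below is only an illustration: the 5-to-1 coding, the function names, and the example responses are all assumptions, since the source does not state how the ratings were scored. The three adjectives are taken from Table 1.

```python
# Hypothetical numeric coding; the source does not state how responses were scored.
RESPONSE_CODES = {"SA": 5, "A": 4, "U": 3, "D": 2, "SD": 1}


def code_sheet(sheet: dict) -> dict:
    """Convert one completed rating sheet (adjective -> circled letters) to numbers."""
    return {adjective: RESPONSE_CODES[answer] for adjective, answer in sheet.items()}


def rating_shift(positive_sheet: dict, negative_sheet: dict) -> dict:
    """Per-adjective difference (positive minus negative condition) for one subject."""
    pos, neg = code_sheet(positive_sheet), code_sheet(negative_sheet)
    return {adjective: pos[adjective] - neg[adjective] for adjective in pos}


# Made-up responses for illustration only; these are not data from the study.
example_positive = {"Warm-hearted": "SA", "Sarcastic": "SD", "Calm": "A"}
example_negative = {"Warm-hearted": "D", "Sarcastic": "A", "Calm": "U"}

if __name__ == "__main__":
    print(rating_shift(example_positive, example_negative))
```

A positive difference means the subject agreed more strongly with the adjective after the Positive Administration than after the Negative Administration.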



TABLE  1 
RATING  SCALE 

Name 


The following words are descriptions of people. You are to describe
your Examiner by encircling the letters which signify whether you
Strongly Agree (SA), Agree (A), Undecided (U), Disagree (D), or
Strongly Disagree (SD) with the descriptive word. The results of the
Rating Scale will be held in strict confidence, so please be frank.


 1.  Aggressive      SA   A   U   D   SD
 2.  Quick           SA   A   U   D   SD
 3.  Warm-hearted    SA   A   U   D   SD
 4.  Easy-going      SA   A   U   D   SD
 5.  Forceful        SA   A   U   D   SD
 6.  Hasty           SA   A   U   D   SD
 7.  Soft-hearted    SA   A   U   D   SD
 8.  Calm            SA   A   U   D   SD
 9.  Independent     SA   A   U   D   SD
10.  Hurried         SA   A   U   D   SD
11.  Gentle          SA   A   U   D   SD
12.  Worrying        SA   A   U   D   SD
13.  Stubborn        SA   A   U   D   SD
14.  Talkative       SA   A   U   D   SD
15.  Appreciative    SA   A   U   D   SD
16.  Emotional       SA   A   U   D   SD
17.  Dominant        SA   A   U   D   SD
18.  Active          SA   A   U   D   SD
19.  Discreet        SA   A   U   D   SD
20.  Excitable       SA   A   U   D   SD
21.  Outspoken       SA   A   U   D   SD
22.  Unselfish       SA   A   U   D   SD
23.  Submissive      SA   A   U   D   SD
24.  Insensitive     SA   A   U   D   SD
25.  Dependent       SA   A   U   D   SD
26.  Sensitive       SA   A   U   D   SD
27.  Sarcastic       SA   A   U   D   SD

Negative and Positive Treatment Conditions

In order to evaluate adequately the results of this experiment,
it was important to develop some degree of standardization in the
administration of the negative and positive treatment conditions. Since the
greatest part of the administration consists of a verbal interaction
between the examiner and subject, procedures were developed to create a
difference between positive and negative treatments in terms of the
manner of administration rather than in terms of a change in the actual
directions of administration which appear in the WAIS manual (Wechsler,
1955). It was also felt that this approach would generalize the results
to the actual day-by-day professional use of the WAIS.

The  treatment  conditions  are  defined  in  the  following  instructions 
to  examiners. 

Positive Administration: In general, this administration is
characterized  by  a  warm  and  highly  approving,  interested  manner  of 
test  administration.  The  examiner  is  prone  to  give  reinforcing 
statements  in  a  warm  tone  of  voice,  i.e.,  he  will  be  personally 
warm  and  appreciative  in  manner.  The  examiner  will  speak  in  an 
encouraging  tone  of  voice  and  look  at  the  subject  with  a  smile  while 
asking  questions,  preparing  tests,  or  giving  directions.  In  other 
words,  he  will  make  every  "Hm!"  sound  like  a  compliment  for  work 
well  done.  The  specific  activities  of  the  examiner  are: 

1.  Introduce  yourself  to  the  subject  and  shake  hands  with  him. 

2.  Look  at  the  subject  while  talking  to  him. 

3. Use a tone of voice which reflects warmth and interest,
e.g.,  speak  slowly,  softly,  and  clearly  and  vary  the 
intonation  pattern. 

4.  Demonstrate  appreciation  of  the  subject's  responses  by 
stating such things as "good," "you're doing O.K.," etc.

5.  Wait  patiently  for  each  response. 

6.  Appear  alert  to  everything  that  the  subject  says,  i.e., 
avoid  appearing  bored  or  tired. 

7. Be quick to smile at appropriate instances, e.g., for a
good  response  or  when  the  subject  smiles. 

8.  When  a  subtest  is  finished,  the  transition  to  the  next 
subtest  should  be  facilitated  by  such  remarks  as,  "Here  is 
something  of  a  different  sort"  or  "I  think  that  you  will 
find  this  interesting."  Never  go  from  one  subtest  to 
another  without  saying  something  of  a  positive  and  reward- 
ing nature. 


Negative Administration: In general, this administration is
characterized  by  a  persistently  rejecting  and  disinterested  manner 
of  test  administration.  The  examiner  is  prone  to  make  punishing 
statements,  consisting  of  remarks  and  actions  designed  to  insult 
the  subject  or  the  response  made,  i.e.,  the  examiner  will  assume 
the  role  of  a  harsh,  rejecting,  authoritarian  figure.  He  will  be 
deliberately  unconcerned  about  the  subject  and  will  not  look  at  him 
while  asking  questions,  preparing  tests,  or  giving  directions;  he 
will  never  smile  and  will  give  directions  in  a  voice  of  dictatorial 
harshness,  making  every  "Hm!"  sound  like  a  sneer.  The  specific 
activities  of  the  examiner  are: 

1.  Do  not  introduce  yourself  to  the  subject  and  do  not  ac- 
knowledge his  attempts  at  introduction. 

2. Never look at the subject while talking to him.

3. Use a tone of voice which reflects coldness and disinterest,
e.g.,  speak  rapidly  but  clearly  in  a  steady  monotone  with 
no  variance  of  the  intonation  pattern. 

4.  Demonstrate  rejection  and  lack  of  appreciation  by  frowning 
and  not  saying  anything  while  the  response  is  being  given. 

5.  Demonstrate  impatience  by  resorting  to  such  things  as  tap- 
ping the  table  with  your  pencil,  looking  up  at  the  ceiling, 
and  becoming  fidgety  as  the  subject  responds. 

6. Create the impression of being bored by sighing heavily
upon occasion and manhandling the test materials in order
to create an impression of distaste for what is being done.

7.  Do  not  smile  and  do  not  respond  to  the  subject's  attempts 
to  be  pleasant  during  the  testing. 

8. When a subtest is finished, the transition to the next
subtest should be characterized by immediately launching into
the administration with no preliminary remarks other than
a disdainful grunt.

Specific application of these instructions to the various subtests
is as follows:

Pentad I: Information, Similarities, and Vocabulary are mainly
tests involving a verbal interaction between the subject and the
examiner. The above suggestions for Positive and Negative treatments
should be applied throughout the testing.

With Picture Arrangement and Object Assembly, there is an
opportunity, through the actual manipulation of test materials, to
delineate further the difference between Positive and Negative treatments.
In the Positive treatment the examiner presents the materials as
they are normally presented. In the Negative treatment the examiner
presents the materials by forcefully putting them on the table and
removing them in the same manner.

Pentad  II:  Comprehension  and  Arithmetic  are  to  be  handled  the 
same  as  Information,  Similarities,  and  Vocabulary  are  handled  in 
Pentad  I  for  the  Positive  and  Negative  treatment. 

Digit Symbol, Picture Completion, and Block Design are to be
handled in the same manner as Picture Arrangement and Object
Assembly are handled in Pentad I for the Positive and Negative
treatment.


Procedure 

To reduce the effects of order and sequence, which introduce
confounding errors of measurement, the experiment followed a pattern of
counterbalancing.

The  eight  examiners  were  broken  down  into  four  blocks  with  two 
examiners  in  a  block.  Within  each  block  there  were  twelve  subjects 
selected  proportionately  from  the  A.C.E.  divisions  discussed  earlier. 
All of the subjects acted as their own controls; thus they experienced
both  treatment  conditions,  but  each  from  a  different  examiner  within 
a  block. 

Since error could arise if an examiner administered only one form
of the treatment condition, the examiners alternated between negative
and positive administrations, so that each examiner gave a total of
six negative administrations and six positive administrations,
alternating from one treatment condition to the other throughout the
sequence of twelve administrations. To control further any possible
sequence effect, the subjects were assigned on a counterbalanced basis
to various examiners and treatment conditions. More specifically, a
subject who received an administration early in one examiner's sequence
did not receive his second administration until late in another
examiner's sequence.

To control for the Pentads themselves introducing error by being
used consistently in one treatment condition, the Pentads were
counterbalanced with regard to the use of one Pentad with one treatment
condition. Thus the Pentads were used as negative and positive treatment
stimuli equally.
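
The counterbalancing described above can be illustrated with a short sketch. It is not the assignment actually used in the experiment: the rule that a subject's position with the second examiner mirrors his position with the first, the function names, and the record layout are all assumptions made for illustration, and the scheduling by evenings is ignored. The pairing of Pentads with treatments by block follows the layout of Table 3.

```python
from collections import Counter

TREATMENTS = ("positive", "negative")
N_SUBJECTS_PER_BLOCK = 12


def treatment_at(position):
    """Alternate treatments along an examiner's sequence of administrations."""
    return TREATMENTS[position % 2]


def pentad_for(block, treatment):
    """Swap the Pentad-treatment pairing from one block to the next."""
    positive_pentad, negative_pentad = ("I", "II") if block % 2 == 0 else ("II", "I")
    return positive_pentad if treatment == "positive" else negative_pentad


def build_schedule(n_blocks=4):
    """Return one record per administration (hypothetical assignment rule)."""
    schedule = []
    for block in range(n_blocks):
        examiners = (2 * block + 1, 2 * block + 2)
        for subject in range(N_SUBJECTS_PER_BLOCK):
            # Assumption: early with the first examiner means late with the second.
            positions = (subject, N_SUBJECTS_PER_BLOCK - 1 - subject)
            for examiner, position in zip(examiners, positions):
                treatment = treatment_at(position)
                schedule.append({
                    "block": block,
                    "examiner": examiner,
                    "position": position,
                    "subject": (block, subject),
                    "treatment": treatment,
                    "pentad": pentad_for(block, treatment),
                })
    return schedule


schedule = build_schedule()
# Administrations per (examiner, treatment) and per (Pentad, treatment).
per_examiner = Counter((a["examiner"], a["treatment"]) for a in schedule)
per_pentad = Counter((a["pentad"], a["treatment"]) for a in schedule)
```

Under these assumptions every examiner gives six administrations of each kind, every subject meets both treatments from different examiners, and each Pentad is used equally often under each treatment.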

The  experiment  was  run  in  the  evening,  each  examiner  working 
in  a  separate  office  testing  three  subjects  each  evening  for  a  total  of 
four  evenings.  The  subjects  each  appeared  twice  on  consecutive  evenings. 

All of the subjects were naive about the purpose of the experiment.
They  were  told  that  the  experiment  was  an  investigation  in  the  area  of 
psychological  testing  and  that  their  part  in  the  experiment  would  be  to 
answer  a  list  of  standardized  questions. 

At the end of the second testing session the subjects were given
two copies of the rating scale appearing in Table 1. The subjects were
asked to rate their "first examiner" and their "second examiner" in
order to maintain subject naivete during this last phase of the
experiment. The subjects were then asked not to discuss the experiment with
anyone until they were informed that the experiment was over.

Scale scores for the subtests were used as data in comparing
the effects of the treatments. The Pentads were scored by the
experimenter using the recommended scoring which appears in the WAIS manual.


CHAPTER III

RESULTS AND DISCUSSION

Results 
The  design  of  this  experiment  allowed  an  analysis  of  variance 
to  be  done  on  the  subtest  scale  score  totals  for  the  Pentads  under  the 
different  treatment  conditions.  Table  2  presents  the  results  of  the 
analysis  of  variance.  The  missing  subject  discussed  earlier  was  handled 
by  replacing  his  missing  score  by  a  value  equal  to  the  mean  of  the  other 
scores  in  that  subject's  treatment-cell  and  subtracting  one  degree  of 
freedom  from  the  degrees  of  freedom  for  total  and  consequently  from  the 
degrees  of  freedom  for  error  also. 
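
The handling of the missing score can be sketched as follows. This is a minimal illustration, assuming a design of four blocks by two treatments with twelve scores per cell; the function names are hypothetical, and the sketch reproduces only the substitution rule and the bookkeeping of degrees of freedom, not the full analysis of variance.

```python
from statistics import mean


def fill_missing_with_cell_mean(cell):
    """Replace None entries with the mean of the observed scores in the cell."""
    observed = [score for score in cell if score is not None]
    cell_mean = mean(observed)
    return [cell_mean if score is None else score for score in cell]


def degrees_of_freedom(n_blocks, n_treatments, n_per_cell, n_missing):
    """Degrees of freedom, with one subtracted from total for each estimated score."""
    n_scores = n_blocks * n_treatments * n_per_cell
    df_blocks = n_blocks - 1
    df_treatments = n_treatments - 1
    df_interaction = df_blocks * df_treatments
    df_total = n_scores - 1 - n_missing
    # The loss carries through to the error term, which is found by subtraction.
    df_error = df_total - df_blocks - df_treatments - df_interaction
    return {
        "examiners": df_blocks,
        "treatments": df_treatments,
        "interaction": df_interaction,
        "error": df_error,
        "total": df_total,
    }


df = degrees_of_freedom(n_blocks=4, n_treatments=2, n_per_cell=12, n_missing=1)
```

With one missing score the sketch gives 94 degrees of freedom for total and 87 for error, which agrees with the values listed in Table 2.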

The nonsignificant F for the variance between examiners indicates
a homogeneity between examiners suggestive of their handling the
treatment conditions in a very similar manner. Table 3 suggests this,
because a high degree of correspondence exists for subtest totals from
block to block for both treatments.

The nonsignificant interaction between examiners and treatments
can also be interpreted along the lines of examiner similarity in
handling the different treatment conditions. The main implication here is
that the treatments had the same relative effects throughout the blocks
or levels.

The F of 2.86 for the variance between negative and positive
treatments falls short of the F of about 4.0 required for significance
at the .05 level. This F approaches significance, and there are probably
many reasons why a significant difference did not appear in the present
study. This F is probably the most important of the statistical
findings, and it tends to support the hypothesis that the attitude of the
examiner has little effect on test results in well-structured testing.
Variables that were not included or manipulated in this experiment, to
be discussed later, suggest that this hypothesis may be unwarranted.

To present the direction of the shift of scores from one treatment
condition to the other and to determine the general consistency of this
shift, the estimated average Full Scale IQ per block of examiners for
both treatment conditions is listed in Table 3. It is apparent that
the shift represents a consistent decrement in score in the negative
treatment condition. Again there is the suggestion that the Pentads
are equivalent and, because they were counterbalanced, have little to do
with confounding the results. Although examiners 3 and 4 show a mild
shift while examiners 7 and 8 show a more severe shift, the difference
in shift is not large enough to create a significant F for interaction.

It was felt that an examination of the effect of the treatments
on the individual subtests would be of interest and value.
Table 4 lists the results of a series of t tests done on each subtest
comparing scale scores for both treatment groups.

A general decrement in score occurs within the negative treatment
group for Information, Comprehension, Similarities, Vocabulary,
Block Design, and Picture Arrangement. A slight increase in score for
the negative treatment appears for Digit Symbol, Picture Completion,
and Object Assembly, while Arithmetic demonstrates no change at all.
These differences are insignificant except for Vocabulary, which is
significant at the .02 level of significance. Why this particular
subtest should demonstrate significance will be discussed later.
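
The direction of shift for each subtest can be checked directly from the means printed in Table 4. The sketch below is only an illustration: the dictionary and function names are hypothetical, and it classifies the direction of the difference without repeating the t tests.

```python
# (positive mean, negative mean) scale scores as printed in Table 4.
TABLE_4_MEANS = {
    "Information": (13.36, 13.00),
    "Comprehension": (13.35, 12.35),
    "Arithmetic": (12.42, 12.42),
    "Similarities": (13.25, 12.25),
    "Vocabulary": (13.43, 12.08),
    "Digit Symbol": (11.69, 11.82),
    "Picture Completion": (12.07, 12.46),
    "Block Design": (12.13, 11.57),
    "Picture Arrangement": (10.96, 10.66),
    "Object Assembly": (10.17, 11.04),
}


def classify_shift(positive_mean, negative_mean):
    """Label the change in score from the positive to the negative treatment."""
    difference = round(positive_mean - negative_mean, 2)
    if difference > 0:
        return "decrement"
    if difference < 0:
        return "increase"
    return "no change"


groups = {"decrement": [], "increase": [], "no change": []}
for subtest, (positive_mean, negative_mean) in TABLE_4_MEANS.items():
    groups[classify_shift(positive_mean, negative_mean)].append(subtest)

for label, subtests in groups.items():
    print(f"{label}: {', '.join(subtests)}")
```

The grouping that results matches the description in the text: six subtests show a decrement under the negative treatment, three show an increase, and Arithmetic shows no change.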

Table 5 presents a comparison of the direction of the subjects'
ratings of the examiners under the positive and negative treatment
conditions using the rating scale presented in Table 1. Table 5 also groups
the descriptive adjectives used for the ratings according to major
personality areas.

The comparisons of whether or not the subjects agreed with the
adjective as being descriptive of the examiner were made using the
chi-square technique as a measure of the significance of difference of
the agree-disagree ratings under each treatment condition. Those
adjectives which were found to be significant are checked in order to indicate
the direction of the rating under each treatment condition.
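
A chi-square comparison of this kind can be sketched as below. The counts are invented for illustration and are not data from this study; the use of four rating categories (which yields 3 degrees of freedom for two treatments) is likewise an assumption, since the layout of the rating categories is not restated here.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if ratings were independent of treatment.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees


# Hypothetical ratings of one adjective by 48 subjects under each treatment,
# ordered from strong agreement to strong disagreement.
hypothetical_ratings = [
    [20, 18, 6, 4],   # positive treatment
    [3, 7, 16, 22],   # negative treatment
]
statistic, degrees = chi_square(hypothetical_ratings)
print(f"chi-square = {statistic:.2f} with {degrees} degrees of freedom")
```

The statistic is then compared with the tabled chi-square value for the same degrees of freedom at the chosen level of significance.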

Of the twenty-seven adjectives it is noticed that eighteen achieve
a significant level of difference in ratings of the examiners. It is
probably noteworthy that under the personality area of Social Sensitivity
all nine of the adjectives are significant, whereas the area of
Dominance yields only three out of eight significant differences, Activity
yields three out of five, and Mood also yields three out of five
significant differences.
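
These counts can be recovered from the chi-square values in Table 5. The sketch below assumes the degrees of freedom and chi-square values as they are read from that table and compares each with the standard tabled chi-square value at the .05 level; the variable names are hypothetical.

```python
# Tabled chi-square values at the .05 level, keyed by degrees of freedom.
CRITICAL_05 = {3: 7.815, 4: 9.488, 5: 11.070}

# (adjective, degrees of freedom, chi-square) as read from Table 5.
TABLE_5 = {
    "Dominance": [
        ("Aggressive", 3, 0.88), ("Forceful", 4, 12.36),
        ("Independent", 3, 3.92), ("Stubborn", 3, 25.80),
        ("Dominant", 3, 17.56), ("Outspoken", 4, 3.54),
        ("Submissive", 4, 6.98), ("Dependent", 3, 0.82),
    ],
    "Activity": [
        ("Quick", 3, 0.24), ("Hasty", 3, 28.92), ("Hurried", 3, 31.20),
        ("Talkative", 3, 46.02), ("Active", 3, 2.80),
    ],
    "Social Sensitivity": [
        ("Warm-hearted", 3, 48.27), ("Soft-hearted", 3, 29.04),
        ("Gentle", 3, 44.25), ("Appreciative", 5, 38.12),
        ("Discreet", 4, 10.52), ("Unselfish", 3, 21.90),
        ("Insensitive", 3, 21.88), ("Sensitive", 3, 11.28),
        ("Sarcastic", 3, 24.06),
    ],
    "Mood": [
        ("Easy-going", 4, 44.92), ("Calm", 3, 24.80),
        ("Worrying", 3, 29.12), ("Emotional", 4, 4.06),
        ("Excitable", 4, 8.72),
    ],
}

# For each personality area: (number significant at .05, number of adjectives).
tally = {}
for area, rows in TABLE_5.items():
    significant = sum(1 for _, df, chi2 in rows if chi2 > CRITICAL_05[df])
    tally[area] = (significant, len(rows))

for area, (significant, total) in tally.items():
    print(f"{area}: {significant} of {total} significant")
```

Counted in this way, the table yields three of eight for Dominance, three of five for Activity, nine of nine for Social Sensitivity, and three of five for Mood, or eighteen of twenty-seven in all.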

More specifically, examiners under the positive treatment are
seen by the subjects as talkative, warm-hearted, gentle, appreciative,
discreet, unselfish, sensitive, easy-going, and calm. The same
examiners under the negative treatment condition are not seen as possessing
these qualities. The negative treatment produced a picture of the
examiners as being forceful, stubborn, dominant, hasty, hurried,
insensitive, sarcastic, and worrying. Conversely, the same examiners
under the positive treatment do not receive these ratings.

TABLE 2

ANALYSIS OF VARIANCE

Source            DF        MS        F       P
E: Examiners       3       5.97      < 1     Not sig.
T: Treatments      1     150.66     2.86     > .05
E X T              3      50.19      < 1     Not sig.
Error Term        87      45.75
Total             94

TABLE 3

ESTIMATED AVERAGE FULL SCALE IQ's FOR
POSITIVE AND NEGATIVE TREATMENTS

              Positive Treatment       Negative Treatment
Blocks        Pentad   Average IQ      Pentad   Average IQ
1 and 2         1         115            2         114
3 and 4         2         114            1         114
5 and 6         1         117            2         115
7 and 8         2         118            1         111


TABLE 4

INDIVIDUAL SUBTEST SCALE SCORES UNDER
EACH TREATMENT CONDITION

                                               Mean
Subtest               Condition    Mean     Difference     SD       t      P
Information           Positive    13.36        .56        1.60    1.02   Not sig.
                      Negative    13.00
Comprehension         Positive    13.35       1.00        4.15    1.13   Not sig.
                      Negative    12.35
Arithmetic            Positive    12.42        0          5.31    0      Not sig.
                      Negative    12.42
Similarities          Positive    13.25       1.00        3.09    1.51   Not sig.
                      Negative    12.25
Vocabulary            Positive    13.43       1.35        2.39    2.65   .02
                      Negative    12.08
Digit Symbol          Positive    11.69       -.13        2.95    -.21   Not sig.
                      Negative    11.82
Picture Completion    Positive    12.07       -.39        2.80    -.66   Not sig.
                      Negative    12.46
Block Design          Positive    12.13        .56        4.26     .58   Not sig.
                      Negative    11.57
Picture Arrangement   Positive    10.96        .30        2.44     .51   Not sig.
                      Negative    10.66
Object Assembly       Positive    10.17       -.87        4.32    -.94   Not sig.
                      Negative    11.04

TABLE 5

COMPARISON OF SUBJECTS' RATINGS OF EXAMINERS
UNDER POSITIVE AND NEGATIVE TREATMENTS

                                        Positive Treatment  Negative Treatment
Adjective         DF       χ²      P     Agree   Disagree    Agree   Disagree

DOMINANCE
Aggressive         3      .88    <.80
Forceful           4    12.36    <.01               X          X
Independent        3     3.92    <.50
Stubborn           3    25.80    <.01               X          X
Dominant           3    17.56    <.01               X          X
Outspoken          4     3.54    <.50
Submissive         4     6.98    >.10
Dependent          3      .82    <.90

ACTIVITY
Quick              3      .24    <.98
Hasty              3    28.92    <.01               X          X
Hurried            3    31.20    <.01               X          X
Talkative          3    46.02    <.01      X                            X
Active             3     2.80    <.50

SOCIAL SENSITIVITY
Warm-hearted       3    48.27    <.01      X                            X
Soft-hearted       3    29.04    <.01      X                            X
Gentle             3    44.25    <.01      X                            X
Appreciative       5    38.12    <.01      X                            X
Discreet           4    10.52    <.05      X                            X
Unselfish          3    21.90    <.01      X                            X
Insensitive        3    21.88    <.01               X          X
Sensitive          3    11.28    .01       X                            X
Sarcastic          3    24.06    <.01               X          X

MOOD
Easy-going         4    44.92    <.01      X                            X
Calm               3    24.80    <.01      X                            X
Worrying           3    29.12    <.01               X          X
Emotional          4     4.06    <.30
Excitable          4     8.72    >.10


Discussion

With regard to Hypothesis I (that there will be a consistent
shift in intelligence test performance as a result of the different
experimentally established negative and positive interactions between
the examiner and the subject), little support is received from the
results of this study. There was a consistent shift, but it was not
statistically significant (P > .05).

Perhaps different results could be attained by utilizing a
different sampling procedure. Studying class attitudes towards psychiatry,
Redlich, Hollingshead, and Bellis (1955) found that there was a higher
degree of rapprochement between the value systems of psychotherapists and of
middle-class patients than between the value systems of psychotherapists
and lower-class patients. Evidence for this conclusion was found in the
facts that lower-class patients did not enter therapy voluntarily, the
therapists demonstrated a greater dislike of lower-class patients than of
middle-class patients, and poor communication existed between therapists
and lower-class patients. The examiners and subjects used in this study
were very much alike in class attitudes and values because of their
identification with the academic atmosphere.

By  generalizing  from  the  Redlich  et  al.  study  to  motivational 
factors  and  the  establishment  of  rapport  in  intelligence  testing,  it  is 
possible  to  suggest  different  sets  of  variables  coming  into  play  with 
such  groups  as  maladjusted  individuals,  minority  groups,  or  persons  with 
different  social  norms  and  values.  If  these  variables  were  taken  into 
consideration  and  manipulated  in  an  experiment  similar  to  the  present 
study  they  could  well  demonstrate  their  importance  by  being  reflected 
in  more  significant  results. 
Another possibility is a related study in which the personalities
of the examiners and subjects are paramount. Although the experimental
difficulties in such a study are, of course, great (especially with
respect to evaluating the examiner's personality), this interesting
problem merits consideration.

Hypothesis II (that the shift in performance will most likely
reflect a decrement in score for the majority of the subjects under the
negative treatment condition) receives support, although not significant
support, from the results. No explanation of these results can be
derived from this experiment. A possible explanation may be found in
related research dealing with the detrimental effects of anxiety and
tension upon intelligence test performance. Sarason et al. (1952) found
that, when subjects with a low-anxiety rating were given ego-involving
instructions on a stylus maze, a slight increase in performance resulted.
Subjects with a high-anxiety rating, however, did poorly under
ego-involving instructions. Wiener (1957), using "distrustfulness" and
"suspiciousness" as independent variables, found that subjects high
in these traits tend to get lower IQ's because they hold back full
answers or actually deny the implications of test questions.

The suggestion from such related research is that the "anxiety"
and "distrust" aroused by the negative treatment condition tend to
lower intelligence test scores. Again there are implications for
further research within the present experimental framework.

The results of the rating scale support both Hypothesis III
(that the subjects' ratings of the examiners will be clearly different
as a result of the different treatment conditions) and Hypothesis IV
(that subjects will perceive the negative-treatment examiners in an
unfavorable light, whereas they will perceive the positive-treatment
examiners in a favorable light). These results also indirectly support
the experimental reality of the two treatment conditions in that the
same examiners received different ratings in accordance with the
treatment conditions.

As a means of studying the effect and operating traits of the
two treatment conditions, the rating scale does an adequate job. It
is a step in the direction of studying the actual interaction between
the examiner and the subject. This study was concerned mainly with
experimentally established interpersonal variables influencing test
performance and not with an attempt to determine how these variables
actually affect the subject. Such a study would require an analysis
of the interaction process itself. In the area of research in
psychotherapy such techniques are being developed (Rubinstein and Parloff,
1959).

Of the individual subtests, Vocabulary was the only one
demonstrating a significant decrement in score as a result of the negative
condition. The mental processes involved in this subtest are generally
regarded as the ability to recall previously acquired verbal meanings
with variations in the quality of responses yielded. Perhaps these
processes are upset by negative influences; in other words, the quality
of recall and response may suffer as a result of the negative treatment
condition.

Although the results of this experiment do not significantly
demonstrate a decrement in intelligence test scores, the other results
of the experiment and related areas of experimentation discussed in
the previous paragraphs suggest further research along this line with
structured psychological tests. The assumption that structured tests
are not affected by interpersonal and situational variables is still
questionable.

If examiner-subject relationships can be demonstrated with
structured tests, the interpretation of test results must go beyond a
purely psychometric interpretation. Schafer suggests that the testing
situation be looked upon as a social situation from which to derive data
in order to understand the subject better. This approach to testing
necessarily increases the possibility of personalized interpretation,
but it also allows for a greater inclusion of data. For this
"total-situation" approach to be admissible to psychological testing, a system
for accurately handling great complexities of data must be worked out
in order to satisfy the criterion of objectivity in test interpretation.


CHAPTER  IV 


SUMMARY


The  purpose  of  this  study  was  to  determine  whether  or  not 
experimentally  established  examiner  attitudes — negative  and  positive — 
could  influence  intelligence  test  scores.  The  study  was  also  concerned 
with  identifying  examiner  traits  of  the  two  treatment  conditions. 

The subjects used in this study were forty-eight male
undergraduate students, who acted as their own controls by participating in
both the negative and positive treatment conditions. The examiners were
eight male graduate students in psychology sophisticated in
intelligence test administration. The examiners received specific instructions
concerning their behavior during the administration of the treatment
conditions.

The measures of intelligence were two short forms of the
Wechsler Adult Intelligence Scale. These short forms were composed of
five subtests each and correlated approximately .97 with the full scale
WAIS and at least .94 with each other. Selected parts of the Chicago Q
Sort were used as a rating scale of the examiners.

The experiment followed a pattern of counterbalancing in order
to control for order and sequence effects. Each examiner tested twelve
subjects and alternated negative and positive administrations. Each
subject received a negative and a positive administration, each from a
different examiner, and alternated positions within the examiners'
sequences. The Pentads received equal use in both treatment conditions
in order to control for biasing factors arising from Pentad differences.
At the end of the second testing session the subjects filled out a
rating scale for each of the two examiners who had worked with them in
the experiment.

Scale scores for the subtests were used as data in comparing the
effects of the treatments. An analysis of variance of these data
revealed no significant F's, although the variance between treatments
approached significance. It was felt that the utilization of different
types of subject populations, other than college sophomores, might well
yield more significant results. The majority of the scores decreased as
a result of the negative treatment condition.

An analysis of the subtests separately revealed that Vocabulary
was the one subtest which demonstrated a significant decrease as a
result of the negative treatment condition.

Chi-squares of the rating scale data demonstrated clear
differences in the subjects' ratings of the same examiners in the different
treatment conditions.

It was felt that more research should be done with well-structured
tests, e.g., intelligence tests, with regard to interpersonal and
situational influences. It was urged that psychologists look upon the
testing situation as a social situation capable of delivering more data
than a pure psychometric approach could yield. The problem of
personalized interpretation was pointed out, and it was suggested that a system
for accurately and objectively handling such complexities be evolved.